A Novel Neighborhood-Weighted Sampling Method for Imbalanced Datasets
Abstract
Weighted sampling methods based on k-nearest neighbors have been shown to be effective for the class imbalance problem. However, when calculating a sample's weight, they usually ignore the positional relationship between the sample and the heterogeneous samples in its neighborhood. This paper proposes a novel neighborhood-weighted method, named NWBBagging, to improve the Bagging algorithm's performance on imbalanced datasets. It considers the neighborhood center when identifying critical samples. A parameter reduction method is also proposed and combined into the ensemble learning framework, which reduces the number of parameters and increases classifier diversity. We compare NWBBagging with several state-of-the-art algorithms on 34 datasets, and the results show that it achieves better performance.
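The abstract does not give the NWBBagging weight formula, so the following is only a minimal sketch of the general idea of kNN-based weighted sampling for Bagging, under stated assumptions. The functions `neighborhood_weights` and `weighted_bootstrap`, the inverse-distance weighting, and the `floor` parameter are all hypothetical choices made for illustration and are not taken from the paper. The sketch weights each sample by how much of its neighborhood's inverse-distance mass belongs to samples of another class, then draws bootstrap bags with those weights as sampling probabilities.

```python
import numpy as np

def neighborhood_weights(X, y, k=5, eps=1e-12):
    """Assumed weighting (not the paper's): share of inverse-distance
    mass among the k nearest neighbors that carry a different label."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise Euclidean distances; a sample is never its own neighbor.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]           # indices of the k nearest neighbors
    nn_dist = np.take_along_axis(d, nn, axis=1)
    closeness = 1.0 / (nn_dist + eps)           # nearer neighbors count more
    hetero = y[nn] != y[:, None]                # True where a neighbor has another label
    return (closeness * hetero).sum(axis=1) / closeness.sum(axis=1)

def weighted_bootstrap(X, y, n_bags=3, k=5, floor=0.05, seed=0):
    """Draw bootstrap index sets with probability proportional to weight.
    The floor (an assumption) keeps samples with zero weight drawable."""
    rng = np.random.default_rng(seed)
    w = neighborhood_weights(X, y, k) + floor
    p = w / w.sum()
    n = len(y)
    return [rng.choice(n, size=n, replace=True, p=p) for _ in range(n_bags)]
```

Each returned index array could be used to train one base classifier of a Bagging ensemble. Samples near the class boundary receive higher weights and are therefore drawn more often than samples deep inside their own class.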
Similar resources
Margin-Based Over-Sampling Method for Learning from Imbalanced Datasets
Learning from imbalanced datasets has drawn increasing attention from both theoretical and practical perspectives. Over-sampling is a popular and simple method for imbalanced learning. In this paper, we show that there is an inherent potential risk associated with over-sampling algorithms in terms of the large margin principle. Then we propose a new synthetic over-sampling method, named Mar...
Imbalanced Datasets: from Sampling to Classifiers
Classification is one of the most fundamental tasks in the machine learning and data-mining communities. One of the most common challenges faced when trying to perform classification is the class imbalance problem. A dataset is considered imbalanced if the class of interest (positive or minority class) is relatively rare as compared to the other classes (negative or majority classes). As a resu...
A Cost-Sensitive Ensemble Method for Class-Imbalanced Datasets
... costs for the positive and negative classes, SVM can be extended to the cost-sensitive setting by introducing an additional parameter that penalizes the errors asymmetrically. Consider a binary classification problem represented by a data set {(x_1, y_1), (x_2, y_2), ..., (x_l, y_l)}, where x_i ∈ R^k represents a k-dimensional data point and ...
A Novel One Sided Feature Selection Method for Imbalanced Text Classification
Imbalanced data can be seen in various areas such as text classification, credit card fraud detection, risk management, web page classification, image classification, medical diagnosis/monitoring, and biological data analysis. Classification algorithms tend to favor the large class and might even treat minority class data as outliers. The text data is one of t...
Handling imbalanced datasets: A review
Learning classifiers from imbalanced or skewed datasets is an important topic, arising very often in practice in classification problems. In such problems, almost all the instances are labelled as one class, while far fewer instances are labelled as the other class, usually the more important class. It is obvious that traditional classifiers seeking an accurate performance over a full range of ...
Journal
Journal title: Chinese Journal of Electronics
Year: 2022
ISSN: 1022-4653, 2075-5597
DOI: https://doi.org/10.1049/cje.2021.00.121